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Abstract 

The  matrix  D  —  kI\n  polynomial  approximations  of  order  N  is  similar  to  a  large  Jordan 
block  which  is  invertible  for  nonzero  k  but  extremely  sensitive  to  perturbation.  Solving  the 
problem  (D  —  kl)f  =  g  involves  similarity  transforms  whose  condition  number  grows  as  AT!, 
which  exceeds  typical  machine  precision  for  N  >  17.  By  using  orthogonal  projections,  we 
reformulate  the  problem  in  terms  of  Q,  the  pseudo-inverse  of  D,  and  therefore  its  optimal 
preconditioner.  The  matrix  Q  in  commonly  used  Chebyshev  or  Legendre  representations 
is  a  simple  tridiagonal  matrix  and  its  eigenvalues  are  small  and  imaginary.  The  particular 
solution  of  (/  —  kQ)f  =  Qg  can  be  found  for  all  real  k  at  high  resolutions  and  low  compu¬ 
tational  cost  (0(iV)  times  faster  than  the  commonly  used  Lanczos  tau  method).  Boundary 
conditions  are  applied  later  by  adding  a  multiple  of  the  known  homogeneous  solution.  In 
Chebyshev  representation,  machine  precision  results  are  achieved  at  modest  resolution  re¬ 
quirements.  Multidimensional  and  higher  order  differential  operators  can  also  take  advantage 
of  the  simple  form  of  Q  by  factoring  or  preconditioning. 
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1  Introduction 


In  many  situations  it  is  nacessary  to  solve  a  family  of  differential  equations  of  the  form 
•^kf  =  9  where  Ak  is  a  differential  operator,  fe  is  a  parameter,  and  /  €  D{Ak)  and  g  e 
are  functions  of  the  independent  variable  s  €  fi.  A  common  example  of  this  problem  is  the 
differential  equation 

{lx  ~  *')  ^ 

where  k  is  real  and  the  absolute  value  of  k  is  bounded  by  a  possibly  large  number  K.  The 
domain  on  which  the  differential  equation  holds  is  a  finite  interval,  say  x  G  (—1, 1),  and  the 
Dirichlet  boundary  condition  f{xb)  =  0  is  applied  at  the  boundary  xt  G  {il}.  Since  the 
problem  admits  the  homogeneous  solution  fh{x)  =  which  can  rapidly  grow  in  the 

sgn(ifc)  direction,  it  is  essential  to  apply  the  boundary  condition  at  the  proper  side  of  the 
domain  (asj  =  8gn(fc))  in  order  to  have  a  well  posed  problem. 

Our  choice  of  the  model  problem  (1)  is  motivated  by  physical  applications,  but  also  by 
the  fact  that  the  polynomial  approximation  of  (1)  has  the  form  of  a.  Jordan  block  matrix. 
Since  any  matrix  can  be  written  as  A  =  S  +  N  where  S  is  diagonalizable,  N  is  nilpotent 
and  SN  =  NS,  the  correspondence  between  (1)  and  its  polynomial  approximation  is  of 
fundamental  importance.  The  well  known  numerical  difficulties  with  Jordan  blocks  of  a 
matrix  are  reflected  in  solving  the  problem  (1) . 

We  shall  show  that  continuous  and  discrete  versions  of  the  problem  (1)  may  be  replaced 
by  a  reformulated  optimally  preconditioned  problem  whose  solution  involves  operations  with 
simple  tridiagonal  matrices.  The  reformulated  problem  leads  to  a  solution  procedure  which 
is  very  efficient,  numerically  stable,  and  accurate  to  machine  precision  (e  Rs  2.22  X 
when  double  precision  is  used)  in  Chebyshev  polynomial  representation  at  modest  resolution 
requirements  which  grow  slowly  with  |A:|. 

1.1  An  example 

In  cylindrical  geometry,  the  Laplacian  operator  may  be  written  as 


Taking  the  Fourier  transform  in  the  9  direction,  we  may  write 


where  k  is  the  circumferential  wavenumber  and  T  and  B  are  operators  of  the  form  (1)  with 
X  =  log(r).  This  factorization  of  the  Laplacian  arises  in  the  context  of  applying  the  exact 
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artificial  boundary  conditions  [6].  To  prohibit  the  growing  mode  as  r  — >■  oo,  one  must  have 
that 

d 


dlog(r) 


+  1^1 


h{r) 


=  0 


(4) 


•=B 


at  the  limit  r  =  of  the  finite  computational  domain.  This  boundary  condition  is  nonlocal 
and  it  is  best  applied  in  the  Fourier  domain.  To  solve  the  problem 


vV 


-w 


(5) 


with  the  boundary  conditions  0|r=i  =  0  and  ^  0  as  r  — ^  oo,  assume  that  the  support  of 

w  is  contained  within  the  computational  domain  of  radius  R.  Given  that  w  =  0  outside  the 
computational  domain,  the  boundary  condition  (4)  applied  at  r  =  is  the  exact  equivalent 
of  the  boundary  condition  at  infinity.  The  solution  in  the  Fourier  domain  takes  three  steps 
and  may  be  written  as 


Cib  =  (6) 

6  =  where  ffc(r)|r=R  =  0  (7) 

where  V’fe(r)|r=i  =  0  (8) 

where  the  inverses  of  the  operators  T  and  B  are  taken  by  applying  the  indicated  boundary 
conditions.  Therefore,  this  numerical  solution  depends  on  solving  the  equation  (1). 


2  Polynomial  approximation 

Our  aim  shall  be  to  solve  this  equation  numerically  on  function  spaces  consisting  of  polyno¬ 
mials  of  degree  at  most  N .  Let  D  be  the  discretized  representation  of  the  derivative  operator 
on  the  (AT  +  l)-dimensional  space  Pm  of  A^-th  order  polynomials,  and  let  us  write  the  original 
equation  as  follows: 

iD-kI)f  =  g  (9) 

where  /  and  g  are  vectors  representing  the  functions  f{x)  and  ^(a:)  in  Pm-  The  boundary 
condition  also  has  to  be  satisfied,  but  as  we  shall  see,  this  leads  to  an  overdetermined  problem 
whenever  it  ^  0.  In  this  regard,  polynomial  approximations  are  intrinsically  different  from 
the  original  continuous  problem. 

There  are  two  difficulties  that  have  to  be  resolved.  First,  each  differentiation  reduces 
the  degree  of  the  polynomial,  cascading  all  polynomials  in  Pm  to  zero  in  at  most  AT  +  1 
differentiation  steps.  Therefore,  D  is  a  nilpotent  operator  with  the  characteristic  polynomial 
x(s)  =  which  is  also  its  minimal  polynomial.  It  follows  that  for  all  fc  ^  0  the  ma¬ 

trix  D  —  hi  is  invertible  and  our  representation  has  a  unique  solution,  which  is  in  general 
incompatible  with  the  boundary  condition. 

Since  =  0,  we  can  write  this  solution  exactly  as  a  finite  sum  when  fc  0: 
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This  equation,  while  exact,  is  extremely  poorly  conditioned  for  large  AT;  this  constitutes 
the  second  difficulty  [5].  By  changing  the  polynomial  basis  functions  from  a:"  to  a:”/n!, 
we  see  that  the  above  formula  defines  an  upper  triangular  Toeplitz  matrix  T  with  entries 
for  j  >  t.  However,  the  similarity  transform  between  the  basis  functions 
s"  and  a:"/re!  has  the  condition  number  N\,  which  appears  singular  to  machine  precision 
when  N  >  17.  Therefore,  this  exact  equation  is  not  computationally  useful  in  the  context 
of  solving  differential  equations  where  N  needs  to  be  much  larger  than  just  17. 

This  problem  becomes  even  harder  when  orthogonal  polynomials  are  used.  The  condition 
number  of  the  similarity  transform  between  the  Chebyshev  polynomial  basis  and 

the  companion  form  basis  which  exceeds  machine  precision  as  soon 

as  AT  >  14. 

Both  difficulties  can  be  traced  to  the  loss  of  the  homogeneous  solution  in  the  discretized 
problem.  More  degrees  of  freedom  are  needed  to  resolve  the  diflBculties.  The  Lanczos  tau 
method  [4]  introduces  an  additional  high  order  polynomial  of  degree  AT  +  1  in  order  to 
satisfy  the  boundary  condition.  The  resulting  system  of  equations  on  Pn+i  may  be  written 
as  (A  +  uv^)f  =  (5^,0)^,  where  A  restricted  to  Pn  coincides  with  D  —  kl  and  the  outer 
product  arises  from  the  boundary  condition.  Even  though  the  matrix  A  may  have  a 
simple  structure,  A  +  ur^  is  hard  to  invert  since  one  cannot  directly  apply  the  Sherman- 
Morrison  formula  [3] 

due  to  the  fact  that  inverting  the  restriction  of  A  to  Pf^  is  ill-conditioned. 

We  propose  a  procedure  which  is  well  conditioned  and  still  preserves  the  tridiagonal  form 
of  the  matrices  involved.  This  approach  is  O(A^)  times  faster  than  the  the  tau  method  which 
introduces  a  full  row  into  the  matrix  to  be  inverted. 


+  v'^A~^u 


(11) 


3  Problem  reformulation 


The  source  of  difficulties  is  the  nilpotent  part  of  D  -  kl.  We  note  that  £>  is  a  non-normal 
operator  for  which  the  commutator  DD* — D*U  is  of  rank  2  in  companion  form  representation 
and  acts  by 


DD* 


N 

-  :  X: 

n=0 


n  AT 

a^x 


ao  - 


N! 


(12) 


SO  it  is  natural  to  use  a  preconditioner  based  on  D.  Fortunately,  the  optimal  preconditioner 
in  typical  spectral  representations  has  a  simple  form.  This  important  observation  suggests 
the  following  solution  method. 

Instead  of  using  the  Lanczos  tau  method  to  apply  the  boundary  condition,  we  shall 
construct  an  integration  operator  Q  as  the  pseudo-inverse  of  D  and  seek  a  particular  solution 
of  the  discretized  problem  rewritten  as  follows: 


(13) 


The  approximation  QD  I  will  be  justified  later,  in  section  6.  The  reformulated  prob¬ 
lem  (/  -  kQ)fp  =  Qg  has  the  solution  =  (I  -  kQ)~^Qg.  The  full  solution  of  (9)  is 
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then  obtained  as  f  =  fp  +  txfh  where  /j,  represents  the  homogeneous  solution  and 

a  =  —fpixb).  This  delayed  application  of  the  boundary  condition  allows  us  to  choose  an 
integration  operator  Q  with  optimal  numerical  properties,  which  can  be  thought  of  as  a  sur¬ 
rogate  boundary  condition.  This  two  step  solution  procedure  is  analogous  to  the  use  of  the 
Sherman-Morrison  formula.  The  reformulated  problem  now  involves  integrations  instead  of 
differentiations.  This  change  leads  to  well  behaved  numerical  implementations.  Moreover, 
we  shall  demonstrate  that  in  typical  polynomial  representations  the  matrix  Q  has  a  simple 
form  which  leads  to  efficient  algorithms. 

Integration  preconditioners  for  differential  operators  in  spectral  methods  are  tridiagonal 
for  arbitrary  classical  orthogonal  polynomial  families  [2].  The  use  of  the  pseudoinverse  of  D, 
proposed  here,  is  an  improvement  which  simplifies  analysis  and  guarantees  optimal  numerical 
results  in  each  orthogonal  polynomial  representation. 


4  The  pseudo-inverse  of  the  derivative  operator 

The  pseudo-inverse  Q  of  the  matrix  D  is  the  unique  minimal  Frobenius  norm  solution  to 
min  II  D(?  —  jTIIf-  This  condition  amounts  to  the  requirement  that  DQ  and  QD  be  orthogonal 
projections  onto  image(Z))  and  image((5),  respectively  [3]. 

To  construct  the  pseudo-inverse  of  D  we  first  note  that  ker(r>)  =  Fo  and  that  image(D)  = 
Pn-i  C  Pn-  From  the  commutative  diagram 

Pn  Pn 

nj.  (14) 

PnIPo  a  Pn-i 

where  11  is  the  canonical  projection  to  the  coset  space  Fjv/Fq,  A  is  invertible,  and  $  is  the 
insertion  map,  we  conclude  that  the  pseudo-inverse  is  given  by 

Q  (15) 

where  is  the  orthogonal  projection  to  Pn-y  and  is  the  insertion  map  whose  image 

is  F^,  the  subspace  orthogonal  to  Fp.  In  this  construction  orthogonality  plays  a  key  role. 
Each  Q  depends  on  a  chosen  inner  product  on  Fw,  which  will  be  clear  from  the  context. 

This  definition  satisfies  the  conditions  that  QD  and  DQ  be  the  orthogonal  projections 
to  image((5)  =  Pq  and  image(D)  =  Pn-i,  respectively.  The  commutator  is 

[D,  Q]  =  DQ-Qn  =  Up„_,  -  npx  =  Up,  -  Up._^  (16) 

which  is  an  operator  of  rank  2,  showing  that  Q  and  D  nearly  commute.  Orthogonal  projection 
operators  IIv  to  various  subspaces  V  will  be  appear  often  in  the  following  discussion. 

Let  {pn}^o  orthogonal  basis  polynomials  (not  necessarily  normalized)  such  that 
deg(pr,)  =  n.  The  matrix  form  of  D  is 


D  = 


Ojvxi  ^ 
Olxl  OlxAT 


(17) 
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and  therefore 


Q  = 


OixAf  Oixl 

^  ^  0/Vx1 


(18) 


This  gives  us  a  simple  method  of  constructing  Q.  For  example,  if  the  polynomials  Pn{x)  = 
x^/n]  are  defined  to  be  orthogonal,  D  is  reduced  to  its  companion  form  where  A  =  /  and 

Q  =  £>^. 

For  the  metric  generated  by  the  Chebyshev  polynomials  p„(a:)  =  r„(a:)  =  cos(re  cos  ^(a:)), 
one  obtains  the  tridiagonal  matrix 


Q 


0  0 
1  0 
0  i 


0 

^  1 
"4 


0 


2(yv-2) 

0 


0 

1 


0 


0 


2(W-1) 

0 


2()V-2) 

0 

1 

2N 


0 

0 

0 


(19) 


while  the  Legendre  polynomials  lead  to  a  slightly  different  tridiagonal  matrix 

0  0  0  0  •••  C 

1  0  -i  0 

0  I  0  -I 


Q 


2N-5 

0 


0 

1 


0 


2N-3 

0 


2N-1 

0 

1 

2N-1 


(20) 


Numerical  evidence  indicates  that  the  eigenvalues  of  Q  in  both  Chebyshev  and  Legendre 
representations  lie  on  the  imaginary  axis.  This  favorable  distribution  of  eigenvalues  need  not 
happen  in  general.  We  conclude  that  in  all  three  cases  (companion  form,  Chebyshev  and 
Legendre  representations)  the  matrix  I  —  kQ  is  an  invertible  tridiagonal  system  for  all  real 
k. 

5  Eigenvectors  and  eigenvalues  of  Q 

Given  an  orthonormal  basis  of  Pn  ordered  by  basis  polynomial  degree,  let  us  denote  the 
components  of  a  vector  v  6  Pjv  by  uq,  ,  ^iv-  In  nny  polynomial  representation,  for 

eigenvalue  A  ^  0  the  equation  Qv  =  Xv  implies  that  the  constant  polynomial  component 
satisfies  ro  =  0  so  that  v  JL  Pq.  Thus,  we  obtain 


DQv  —  XDv 


(21) 
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where  DQ  is  an  orthogonal  projection  to  Pat-Ij  range  of  D.  Recall  that  ± 
scale  V  so  that  v  =  u  +  Pn  where  u  G.  Pn-i-  Therefore,  DQv  =:  v  —  and 

(/  -  XD)v  =  pN  (22) 


so  that  ^ 

v  =  {I-  XD)-^pm  =  E  A”D>a,  (23) 

n=0 

where  v  L  Pq  provided  that  A  is  an  eigenvalue  of  Q.  This  exact  equation  defines  the 
eigenvectors  once  the  nonzero  eigenvalues  of  Q  have  been  obtained,  but  it  is  very  sensitive 
to  perturbations  in  A  due  to  the  same  numerical  difficulties  as  the  equation  (9).  Conversely, 
eigenvalues  A  are  very  sharply  defined.  Our  aim  shall  be  to  determine  the  form  of  eigenvectors 
analytically. 

We  note  that  the  homogeneous  solution  of  the  continuous  equivalent  of  the  eigenvector 
equation  (22)  is  a  highly  oscillatory  function  for  small  but  nonzero  imaginary  A.  One  is 
reminded  of  the  Fourier  series,  since  the  frequencies  |1/A|  are  approximately  evenly  spaced 
given  Chebyshev  representation.  The  integration  operator  Q  in  Chebyshev  representation 
may  be  thought  of  as  a  rough  approximation  of  the  Fourier  integration  operator.  This 
suggests  that  the  Jordan  decomposition  of  Q  may  be  reasonably  well  conditioned. 

A  particularly  simple  form  of  v  can  be  derived  when  N  +  1  =  2*^.  In  this  case,  the  sum 
of  A  +  1  terms  can  be  factored  as  a  product  of  only  log2(A^  +  1)  terms  so  that 


V 


n(/+(AD)»')p„. 


j=0 


(24) 


The  eigenvalue  zero  has  multiplicity  at  least  one,  since  Qp;^  =  0.  A  basis  for  the  invari¬ 
ant  subspace  associated  with  A  =  0  can  be  constructed  by  seeking  nontrivial  solutions  to 
equations  of  the  form  Q^Vj  =  0.  For  basis  polynomials  a:"/«!  which  bring  D  to  its  companion 
form,  all  eigenvalues  of  Q  are  zero,  and  Q  acts  by  mapping  z"/n!  i-»  /(ra  + 1)!  for  n  <  AT. 

For  Chebyshev  or  Legendre  polynomials  the  matrix  Q  is  related  to  the  simple  matrix 


0  0  0  0  0 
10-10 
0  10-1 


10-10 

0  10  0 

0  0010 


(25) 


by  row  or  column  scaling,  respectively.  This  matrix  also  corresponds  to  another  Q  obtained 
from  polynomial  basis  functions  which  satisfy  the  recurrence  relation  p„+i(x)  —  p„_i(x)  = 

fpn{x)dx. 
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Let  A  be  a  nonzero  eigenvalue  of  this  matrix.  The  first  N  components  Uoj  ■  ■  -  ?  ^N-l  of 
corresponding  eigenvector  may  be  then  written  as 

Vn  =  sin(»7^)  (26) 

with  the  last  component  being  =  -  coR(Ar^)ta.n(^^)i^/2.  This  follows  from  the  formula 

t  sm((77  —  1)0)  —  T  sm((n  +  1)0)  =  2i  cos(0)  sin(ra0)  (27) 

i 

and  the  requirement  that  sln(A^0)  =  0.  Nontrivial  solutions  can  be  obtained  for  0  =  mnfN 
where  m  =  1, . . . ,  AT-l  (except  that  m  ^  N/2).  The  corresponding  eigenvalues  A  =  2*  cos(0) 
all  lie  on  the  imaginary  axis.  The  remaining  2  (for  odd  AT)  or  3  (for  even  N)  eigenvalues 
are  zero,  with  the  eigenvector  pat-  Therefore,  all  eigenvalues  lie  on  the  imaginary  axis  and 
|A  <  2. 

For  Q  obtained  from  the  Chebyshev  polynomial  basis,  we  write  the  eigenvector  compo¬ 
nents  I  as  =  i~^Zn  and  require  Zg  =  Zff  =  0.  The  last  component  is  then 

v;>f  =  We  obtain  the  recurrence  relation 

Zn-l  +  Zn^i  =  —2inXZn  (28) 

whose  solutions  for  nonzero  A  are  linear  combinations  of  Bessel  functions  [1] 

Z„  =  aJ„(w)+/9r„(u;)  (29) 


where  we  have  defined  a;  =  —i/X. 

We  can  obtain  eigenvectors  v  provided  that  A  is  a  root  of  the  equation 


det 


Jo(w)  l^(w) 


=  0 


(30) 


or  equivalently 


where  Hankel  functions  =  J„(w)  +  *K(w)  are  introduced. 

Remembering  that  arg(z)  =  lm(log(2!)),  we  consider  the  imaginary  part  of  the  function 


0(u;)  =  ^logj 


41i(^)  -  4li(^) 


(32) 

(33) 


which  is  approximately  1  over  an  interval  of  width  0(Ar)  and  decays  towards  zero  for  u;  »  M. 
The  plot  of  im(0(t«;))  for  the  case  AT  =  32  is  shown  in  figure  1.  We  conclude  that  in  Chebyshev 
representation  the  eigenvalues  A  =  — i/w  are  nearly  evenly  spaced  in  w  over  a  broad  range. 
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The  smallest  root  iv  corresponds  to  the  largest  A.  For  w  <S  AT,  the  equation  (31)  reduces 
to  arg(ff^‘^(u;))  fa  7r/2  and  we  conclude  that  min(|u;|)  fa  jo.i,  the  first  zero  of  the  Bessel 
function  The  corresponding  max(|A|)  fa  l/jo,i  =  0.415831. 

Since  Qpn  =  0,  A  =  0  is  an  eigenvalue  with  multiplicity  at  least  one.  The  basis  of  the 
inva.ria.Tit  subspace  associated  with  A  =  0  depends  on  the  parity  of  N.  For  odd  AT,  A  =  0  has 
multiplicity  2  and  Q  acts  by  mapping 

Ar(l,0,2,0,2,0,...,2,0)^  Ao  ,  (34) 

while  for  even  AT,  the  zero  eigenvalue  has  multiplicity  three  and  one  obtains  the  sequence 

AT  [Ny2, Q,N^  -  2\0,N^  -  . . .  ,N^  -  [N  -  2)^ 0, o)’’ 

Ar(0,2,0,2,0,...,2,0)'^H^pAfi^0.  (35) 


This  longer  sequence  is  also  responsible  for  a  substantial  increase  in  the  condition  number 
of  the  similarity  transform  S  between  the  Q  in  Chebyshev  representation  and  its  Jordan 
decomposition.  For  example,  k(S')  is  36.3848  for  N  —  63,  increases  to  1.38497  x  10®  for 
N  =  64,  and  goes  back  down  to  23.3846  for  N  =  65.  In  general,  odd  N  produced  better 
conditioned  Jordan  decomposition  in  all  of  the  numerical  tests,  with  k:(5)  RS  0(A^^/2).  We 
shall  pay  particular  attention  to  odd  N  of  the  form  2"^  —  1. 

The  physical  interpretation  of  these  sequences  generated  by  the  nilpotent  part  of  Q 
for  large  N  is  that  polynomieJ  representations  of  functions  whose  first  (and  possibly  second) 
integrals  have  discontinuities  at  the  boundaries  (a:  =  ±1)  are  mapped  to  p^,  which  is  mapped 
to  zero. 

While  this  completes  the  description  of  the  effect  of  Q  in  Chebyshev  representation,  the 
equation  (31)  could  not  be  solved  in  closed  form  for  all  roots  w.  Instead,  the  characteristic 
polynomial  of  Q  was  derived  analytically  in  Mathematica  3.0  for  N  =  32,  confirming  that 
machine  precision  intervals  around  the  numerically  computed  eigenvalues  bracket  the  true 
roots.  Numerical  evidence  suggests  that  all  eigenvalues  of  Q  obtained  from  the  Chebyshev 
polynomial  basis  lie  on  the  imaginary  axis  and  that  |A|  <  0.415831.  This  inequality  is  in 
excellent  agreement  with  the  analytic  approximation  (within  machine  precision  for  N  > 
12).  Similarly,  eigenvalues  of  Q  arising  from  the  Legendre  polynomial  basis  are  also  on  the 
imaginary  axis,  with  1A|  <  0.318310. 

Finally,  the  square  of  the  Frobenius  norm  of  Q  in  the  limit  Af  — >■  oc  is  given  by 


1  1 

^  4  ■*'  4(N'  -  1)2 


4Ar2 


N-2 

+  E 

n~2 


1  ^ 

— -  — >■  —  +  -  =  1.57247 
2n2  12  4 


(36) 


for  the  Q  derived  in  the  Chebyshev  representation  and  by 


1  + 


32 


N-l  2 

^S(2n+l)» 


—  =  1.36629 


(37) 


for  the  Q  derived  in  the  Legendre  representation. 
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6  Numerical  results 


Since  Q  is  singular  and  its  eigenvalues  A  lie  on  the  imaginary  axis,  for  all  real  k  the  condition 
number  of  /  —  kQ  is 

k{I  -  kQ)  =  yjl  +  (fcmax(|A|))2  (38) 

which  approaches  |fe|max(|A|)  for  large  \k\.  The  eigenvalues  of  (7  -  kQ)~'^Q  are  of  the 
form  fj,  =  A/(l  -  fcA).  For  real  k  one  obtains  =  |A|®/(1  +  fc^|A|^).  We  conclude  that 
max(  p|)  =  max(|A|).  Both  Chebyshev  and  Legendre  polynomial  bases  have  small  upper 
bounds  on  |A|,  so  that  the  spectral  radius  of  the  proposed  numerical  scheme  is  well  behaved 
for  all  real  k. 

The  accuracy  of  the  proposed  method  can  be  analyzed  as  follows.  Let  ft  represent  the 
true  solution  satisfying  the  boundary  condition  and  let  {D  —  kl)ft  =  g.  When  fc  =  0,  the 
truncation  error  in  representing  /{(s)  corresponds  to  setting  the  last  coefficient  gjv  to  zero. 
Similarly,  the  truncation  error  in  representing  the  exact  homogeneous  solution  fh{x)  is  also 
involved  when  k  The  error  in  determining  fp  needs  to  be  analyzed  next. 

Since  QD  =  I  -  Iln,  and  DQ  =  I  -  Ilpi.  ,  writing  (7  -  kQ)fp  =  Qg  =  {QD  —  kQ)f 
leads  to 

/p  =  (7  kQ)-^iI  kQ  (39) 

SO  that 

ft-fp  =  (I-kQ)-^UpJt  (40) 

and  since  D  —  kl  =  D{I  —  kQ)  —  np±_^ 

(£>  -  kl)fp  ^g  +  kllp^  JI  -  kQ)-^IlpJt.  (41) 


The  last  equation  follows  from  the  observation  that  DUp^  —  0.  Let  /□  =  (po)  Upo/t)  be  the 
constant  function  component  of  ft.  Therefore,  fp  is  the  exact  solution  of  a  modified  problem 
where  a  term  proportional  to  PAr(a5)/o  has  been  added  to  the  right  hand  side. 

When  fc  =  0  or  /o  =  0,  the  method  is  exact  apart  from  the  truncation  error.  When  k 
and  fa  ^  0,  the  additional  term  in  (41)  follows  from  the  expansion  by  minors,  which  is  easily 
evaluated  since  Q  is  tridiagonal  in  representations  of  interest.  For  example,  in  Chebyshev 
representation,  we  obtain 

c(fe,  Ar)p^^(a:)/o  “*=  e(A:,  N]pN{x)fo  (42) 

where  ^  ^ 

=  det(7-fc0)  "  i-k)'^+^XNil/k) 

and  xw(0  is  the  characteristic  polynomial  of  Q.  The  proportionality  constant  is  given  by 


(44) 


and  decreases  rapidly  as  N  oo. 
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Let  us  consider  the  continuous  analogue 


^  ^  J  h[x)dx  =  1  where  J  h[x)dx  L  Pq 


and  the  exact  solution 


fe(a:)  = 


W 


The  Chebyshev  polynomial  representation  of  h(^x)  is 


(46) 

(46) 


(47) 


where  Iv{k)  are  the  modified  Bessel  functions.  For  large  enough  N  the  truncation  error 
becomes  small  and  h;v+i|  <  l^wl  ^  e-  Therefore,  whenever  |fe|  < 


{I-kQ)h  =  l 


hffPM 

2{N-l) 


hj^+iPisf+i 

k - tt; - 1 

2N 


(48) 


so  that  h  is  approximately  the  aolution  of  the  discrete  problem  a.8  well.  We  obtain 

£{k,N)  «  when  |fe|  <  N.  (49) 

fn(fc) 

However,  the  range  of  applicability  of  this  approximation  is  too  narrow  since  the  term 
e(fc,  iV)  may  be  small  even  if  |fc|  »  AT.  The  resolution  criterion  |e(fc,  AT)]  <  e  leads  to 


4  4  /  e  In  +  1 

lXiv(l/*:)l  >  ^AT+i  “  ~  (^2(Ar  +  1)  j  V  27r 


(50) 


where  the  approximation  follows  from  Stirling’s  asymptotic  formula  for  A^!  =  r(A^  +  1) 
applicable  for  AT  »  1.  This  criterion  can  be  simplified  as  follows. 

Clearly  c(0,  AT)  =  1.  For  each  N,  nonzero  eigenvalues  of  Q  occur  in  conjugate  pairs  on 
the  imaginary  axis  and  det(/  —  kQ'j  is  the  product  of  the  terms  of  the  form  1  +  >  1, 

showing  that  |c(S:,  AT)]  ->■  0  as  \k  -¥■  oo.  One  would  expect  that  the  solution  k  to  |e(fe,  A/’)!  <  c 
depends  on  the  ability  to  resolve  the  thin  boundary  layer  near  Xb-  Indeed,  by  curve  fitting 
the  numerical  results  for  A^  =  4,5, ... ,  128  one  obtains  the  criterion  that  machine  precision 
e  =  2.22045  x  10“^®  is  approximately  reached  provided  that  N  >  2.06131  +  8.53338^1  +  |fc|. 

Truncation  error  in  representing  /h(a:)  must  also  be  considered.  The  resolution  require¬ 
ments  are  virtually  identical.  For  Chebyshev  polynomial  basis  eind  |fc|  »  1,  fh{x)  can  be 
represented  to  machine  precision  provided  that  the  last  coefficient  of  fh  (which  is  given  in 
terms  of  the  modified  Bessel  function  /;v('))  satisfies  2/;v(|fc|)e  <  e.  This  leads  to  the 

approximate  criterion  AT  >  2.81178  +  8-05974^1  +  |fe|,  according  to  asymptotic  analysis  and 
curve  fitting  to  numerical  experiments. 

The  simplified  combined  criterion 

N>3  +  9^|fc|  +  l  ^  k\<  -  1  and  AT  >  12  (51) 
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works  well  in  practice  (figure  2).  The  criterion  (51)  has  been  verified  for  12  <  <  256, 

where  |e(it,  N)\  <  6.60738  x  10“^®  and  the  truncation  error  is  even  lower.  For  N  >  256,  the 
resolution  requirements  are  only  slightly  more  stringent,  since  the  worst  case  residual  error 
appears  to  be  slowly  growing  with  N.  A  more  complicated  criterion  of  the  form 

^/|fc|  +  l<|^  +  ^log(Ar)-3  (52) 

fits  the  numerical  results  almost  exactly  for  128  <  AT  <  256  a,nd  may  be  extrapolated  to 
larger  N.  Our  numerical  results  indicate  that  the  criterion  (52)  produces  worst  case  residues 
|e(fc,  N)\  of  less  that  lOOe  at  iV  =  4096  and  less  than  10“^“  at  iV  =  2*^®  =  65536. 

Numerical  testa  in  Cliebyshev  polynomial  representation  were  done  a.t  N  =  16,32,64 
and  k  =  -10,-1,0,1,10  and  compared  to  the  analytically  obtained  integrals  for  g{x)  = 
T„(a;)  where  n  =  1, 2, 4, 8, ... ,  N/2.  Standard  precision  was  used  to  obtain  the  polynomial 
approximation  of  the  solution  but  the  exact  integrals 

/w  = /%«— (53) 

were  obtained  analytically  and  then  evaluated  in  extended  precision  using  Mathematica  3.0, 
which  was  essential  to  capture  the  delicate  cancellations  in  the  exact  solutions.  For  example, 
the  exact  solution  of 

=  (54) 

with  the  boundary  condition  /(I)  =  0  satisfies 

f(0)  =  1523681485112275626378695517688630699203508225/e  - 

560531093266377270849883382873867646780763137  (55) 

which  is  about  —0.00062,  i.e.  48  orders  of  magnitude  less  than  the  terms  involved. 

In  63  of  the  75  tests,  the  L2  and  Lra  relative  norms  of  the  error  at  the  Chebyshev 
collocation  points  were  below  10“^^.  The  remaining  12  results  are  summarized  in  Table  1. 
These  results  match  the  above  analysis.  For  example,  when  |fcl  =  10  and  N  =  16,  the  relative 
truncation  error  in  representing  is  2.73  X  10~®  and  does  not  become  negligible  until 

iV  =  32. 


7  Generalizations 


The  method  presented  here  may  be  generalized  to  higher  order  equations  in  one  variable  of 
the  form 

j 


^D^Ajf  =  g  (56) 


rewritten  as 


j=o 


j=a 


(57) 
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Figure  2:  The  maximum  residual  log^o |e(Aj(Ar).  iV)|  vs.  N  for  the  worst  case  k[N)  according  to 
the  reRolution  criteria.  (51)  [upper  points]  a.nd  (52)  [lower  points].  The  criterion  (51)  gives  residues 
closer  to  c  for  iV  <  80  and  remains  usable  at  somewhat  larger  AT,  but  the  slightly  more  conservative 
criterion  (52)  is  closer  to  achieving  full  precision  for  iV  >  80.  Horizontal  lines  indicate  the  machine 
precision  f.  =  and  the  error  tolerance  set  a.t  100c. 
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N 

k 

n 

ll<S||2/||/||2 

\\SU  /Hoc 

16 

-10 

1 

8.55882  X  10"' 

8.71212  X  10"' 

16 

-10 

2 

1.28628  X  10-® 

2.01382  X  10"® 

16 

-10 

4 

3.03513  X  10  ® 

3.05528  X  10  ® 

16 

-10 

8 

7.85599  X  10-® 

8.82017  X  10"® 

16 

1 

8 

4.96642  X  10"^® 

4.46525  X  10"^® 

16 

1 

8 

4.97129  X  10~^^ 

4.47325  X  10"^® 

16 

10 

1 

8.55882  X  10"^ 

8.71212  X  10"^ 

16 

10 

2 

1.28628  X  10"® 

2.01382  X  10"® 

16 

10 

4 

3.03513  X  10"® 

3.05528  X  10"® 

16 

10 

8 

7.85599  X  10"® 

8.82017  X  10"® 

32 

-10 

16 

1.11479  X  10"^^ 

1.09573  X  10"^“ 

32 

10 

16 

1.11461  X  10"^* 

1.09789  X  10"^^ 

other  63  tests 

<  10"''* 

<  10"''* 

Table  1:  S.elative  error  in  the  L2  and  the  Loo  norm  at  the  Chebyshev  collocation  points  for 
each  resolution. 

so  that  fp  becomes 


Invertibility  depends  on  the  specific  A^.  Furthermore,  J  boundary  conditions  are  to  be  satis¬ 
fied  by  adding  a  linear  combination  of  J  precomputed  homogeneous  solutions.  Alternatively, 
problems  of  the  form 

{[iD-A^)f  =  g  (59) 

can  be  converted  into  a  sequence  of  first  order  problems 

f,i  =  {I-QA^)-^Qgi  (60) 

where  9i  =  g,  fj  =  f  and  Qj  =  fj-i  for  j  =  2, . . . ,  J.  In  addition  to  invertibility  require¬ 
ments,  this  approach  requires  precomputing  the  homogeneous  solutions  of  the  J  subproblems. 
Boundary  conditions  on  each  fj  ~  fpj  +  ajfhj  must  also  be  available. 

In  multidimensional  domains,  the  number  of  homogeneous  solutions  is  proportional  to 
the  number  of  boundary  points.  Therefore,  the  proposed  method  is  most  beneficial  in  low 
dimensional  problems  where  the  differential  operators  have  a  favorable  structure.  However, 
some  multidimensional  problems  may  be  factored  into  a  sequence  of  one  dimensional  prob¬ 
lems,  for  which  the  proposed  method  can  be  very  effective.  In  fact,  this  investigation  was 
motivated  by  a  factorization  of  the  2D  Laplacian.  This  example  shows  that  non-normal 
operators  of  type  (9)  arise  naturally  in  solving  partial  differential  equations  as  well. 
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8  Conclusion 


The  advantages  of  the  proposed  method  include  numerical  stability,  high  accuracy  and  ef¬ 
ficiency.  The  pseudo-inverse  of  the  derivative  operator  has  a  simple  tridiagonal  form  in 
commonly  used  polynomial  representations  and  /  —  hQ  is  easily  inverted  for  real  k.  The 
boundary  condition  is  applied  afterwards.  By  contrast,  the  Lanczos  tau  method  applies 
the  boundary  condition  by  introducing  a  full  row  into  the  matrix  to  be  inverted,  and  thus 
increases  the  computational  effort  by  a  factor  of  0(Ar). 

Our  analysis  and  numerical  experiments  indicate  that  the  pseudo-inverse  of  D  and  delayed 
application  of  boundary  conditions  are  most  useful  in  one-dimensional  problems  and  low 
multidimensional  problems  which  may  be  factored  into  one-dimensional  subproblems. 
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